Numpy

Fundamental building block of scientific Python.

  • Main attraction: Powerful and highly flexible array object; your new ubiquitous working unit.
  • Set of most common mathematical utilities (constants, random numbers, linear algebra functions).

Import


In [ ]:
# imports
import numpy as np                 # It will be used a lot, so the shorthand is helpful.
import matplotlib.pyplot as plt    # Same here.
%matplotlib inline

# these can be useful if you plan on using the respective functions a lot:
np.random.seed(42)                 # Seeding is important to replicate results when using random numbers.
rnd = np.random.random

sin = np.sin                       # Be careful not to write "sin = np.sin()"! Why?
cos = np.cos

RAD2DEG = 180.0/np.pi              # Constants for quick conversion between radians (used by sin/cos) and degrees
DEG2RAD = np.pi/180.0

Numpy array basics

Every numpy array has some basic attributes that describe its format. Note that numpy arrays cannot change their size once they are created, but they can change their shape, i.e., an array will always hold the same number of elements, but their organization into rows and columns may change as desired.

  • ndarray.ndim: The number of axes/dimensions of an array. The default matrix used for math problems is of dimensionality 2.
  • ndarray.shape: A tuple of integers indicating the size of an array in each dimension. For a matrix with n rows and m columns, shape will be (n,m). The length of the shape tuple is therefore the rank, or number of dimensions, ndim.
  • ndarray.size: The total number of elements of the array. This is equal to the product of the elements of shape.
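  • ndarray.dtype: The data type of the array elements. It can be set when the array is created; otherwise np.array infers it from the given data, while creation functions such as np.zeros and np.ones default to 64-bit floating point values.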

(see: Numpy basics)


In [ ]:
m = np.array([[1,2,3],
              [4,5,6],
              [7,8,9]], dtype=np.int32) # np.float32, np.float64, np.complex64, np.complex128
print(m)
print('ndim:  {}\nshape: {}\nsize:  {}\ndtype: {}'.format(m.ndim, m.shape, m.size, m.dtype))
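The relation between shape, size and dtype described above can be sketched as follows. This is a minimal illustration, not part of the original notebook; the variable names (a, b, c, ints, floats, zeros) are made up for this example.

```python
import numpy as np

a = np.arange(12)          # 12 elements, shape (12,)
b = a.reshape(3, 4)        # the same 12 elements as 3 rows x 4 columns
c = a.reshape(2, 2, 3)     # the same 12 elements along three axes

# size stays constant while ndim and shape change
for arr in (a, b, c):
    print('ndim: {}  shape: {}  size: {}'.format(arr.ndim, arr.shape, arr.size))

# dtype is inferred from the data unless it is given explicitly
ints = np.array([1, 2, 3])         # some integer type (width depends on the platform)
floats = np.array([1.0, 2, 3])     # one float is enough to make the array float64
zeros = np.zeros(3)                # creation functions default to float64
print('{} {} {}'.format(ints.dtype, floats.dtype, zeros.dtype))
```

In every case the product of the entries of shape equals size.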

Under the hood

  • Numpy arrays believe that sharing is caring and will share their data with other arrays. Slicing does NOT copy the data, but instead returns a view on the data of another array:

In [ ]:
s = m[1]
print('BEFORE')
print('{} slice\n'.format(s))
print('{}\n'.format(m))
s[0] = 0                      # writing to the slice also changes m
print('AFTER')
print('{} slice\n'.format(s))
print('{}\n'.format(m))
  • You can check whether an array actually owns its data by looking at its flags (you should understand both differences between the two flag settings):

In [ ]:
print('{}\n'.format(m.flags))
print(s.flags)
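If an independent array is needed instead of a view, an explicit copy can be made. The sketch below contrasts the two cases; the names view and copy are chosen for this example only, and np.shares_memory is used as one possible way to test for shared data.

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

view = m[1]           # basic slicing: shares the data of m
copy = m[1].copy()    # explicit copy: owns its own data

view[0] = 0           # also changes m[1, 0]
copy[1] = -1          # leaves m untouched

print(np.shares_memory(m, view))   # True
print(np.shares_memory(m, copy))   # False
print('{} {}'.format(view.flags['OWNDATA'], copy.flags['OWNDATA']))
print(view.base is m)              # a view remembers the array it was taken from
```

Here m is created with np.array and therefore owns its data; an array produced by reshape would itself be a view.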

Array creation


In [ ]:
# helper function for examples below; plots the graphical depiction of a given numpy array
def showMatrix(X):
    Y = np.array(X, ndmin=2)  # 1D -> 2D; arrays with 2 or more dimensions keep their shape
    vmin = min(np.min(Y), 0)  # color range always covers at least [0, 1]
    vmax = max(np.max(Y), 1)
    plt.imshow(Y, interpolation='none', vmin=vmin, vmax=vmax, cmap='Blues')

In [ ]:
Z = np.zeros(9)
showMatrix(Z)

In [ ]:
Z = np.zeros((5,9))
showMatrix(Z)

In [ ]:
Z = np.ones(9)
showMatrix(Z)

In [ ]:
Z = np.ones((5,9))
showMatrix(Z)

In [ ]:
Z = np.array( [0,0,0,0,0,0,0,0,0] )
showMatrix(Z)

In [ ]:
Z = np.array( [[0,0,0,0,0,0,0,0,0],
               [0,0,0,0,0,0,0,0,0],
               [0,0,0,0,0,0,0,0,0],
               [0,0,0,0,0,0,0,0,0],
               [0,0,0,0,0,0,0,0,0]] )
showMatrix(Z)

In [ ]:
Z = np.arange(9)    # the numpy arange function also allows floating point arguments
showMatrix(Z)

(see also: linspace)
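A short comparison of the two functions, as a sketch with example values chosen so that they are exactly representable as floats: np.arange takes a step size and excludes the stop value, while np.linspace takes the number of points and includes the stop value by default.

```python
import numpy as np

a = np.arange(0.0, 1.0, 0.25)    # start, stop (exclusive), step
l = np.linspace(0.0, 1.0, 5)     # start, stop (inclusive), number of points

print(a)    # 4 values, 1.0 is not included
print(l)    # 5 values, 1.0 is included
```

With floating point steps that are not exactly representable, the length of an arange result can be hard to predict, which is one reason to prefer linspace in that situation.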


In [ ]:
Z = np.arange(5*9).reshape(5,9)
showMatrix(Z)
  • Reshape must not change the number of elements within the array.
  • A vector of length n and a matrix of dimensions (1,n) ARE NOT THE SAME THING!
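Both points can be checked directly. The following is a minimal sketch with illustrative names (v, row, col, reshape_error) that do not appear elsewhere in this notebook.

```python
import numpy as np

v = np.arange(9)          # vector, shape (9,)
row = v.reshape(1, 9)     # matrix with one row, shape (1, 9)
col = v.reshape(9, 1)     # matrix with one column, shape (9, 1)

print('{} {} {}'.format(v.shape, row.shape, col.shape))
print('{} {}'.format(v.T.shape, row.T.shape))   # transposing a vector changes nothing
print(v[0])               # a single number
print(row[0])             # a whole vector of length 9

# reshape refuses to change the number of elements
try:
    v.reshape(2, 5)       # 10 slots for 9 elements
    reshape_error = None
except ValueError as err:
    reshape_error = str(err)
print(reshape_error)
```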

In [ ]:
Z = np.random.uniform(0,1,9)  # args: min, max, no. of elements
showMatrix(Z)

In [ ]:
Z = np.random.uniform(0, 1, (5, 9))
showMatrix(Z)

Array slicing


In [ ]:
# single element
Z = np.zeros((5, 9))
Z[1,1] = 1
showMatrix(Z)

In [ ]:
# single row
Z = np.zeros((5, 9))
Z[1,:] = 1
showMatrix(Z)

In [ ]:
# single column
Z = np.zeros((5, 9))
Z[:,1] = 1
showMatrix(Z)

In [ ]:
# specific area
Z = np.zeros((5, 9))
Z[2:4,2:6] = 1            # for each dimension format is always: <from:to:step> (with step being optional)
showMatrix(Z)

In [ ]:
# every second column
Z = np.zeros((5, 9))
Z[:,::2] = 1              # for each dimension format is always: <from:to:step> (with step being optional)
showMatrix(Z)

In [ ]:
# indices can be negative
Z = np.arange(10)
print(">>> Z[-1]:   {}".format(Z[-1]))       # start indexing at the back
print(">>> Z[3:-3]: {}".format(Z[3:-3]))     # slice of array center
print(">>> Z[::-1]: {}".format(Z[::-1]))     # quickly reverse an array

Broadcasting

Arithmetic operations applied to two Numpy arrays of different shapes lead to 'broadcasting', i.e., the smaller array is virtually repeated along its missing or length-1 dimensions so that both shapes match, provided the shapes are compatible. This includes:

  • Adding/subtracting/etc. a single value to a matrix.
  • Adding/subtracting/etc. a column/row vector to a matrix.
  • Adding/subtracting/etc. a column and a row vector.

NOTE: Multiplying with * WILL ALSO BE APPLIED elementwise! Use np.dot() for actual matrix multiplication!

FUN FACT: Truth value checks will also be applied elementwise.

(see: Numpy broadcasting)
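The cases listed above can be sketched as follows. All names (M, row, col, X and the result variables) are chosen for this illustration only.

```python
import numpy as np

M = np.zeros((3, 4))
row = np.arange(4)                   # shape (4,)
col = np.arange(3).reshape(3, 1)     # shape (3, 1)

A = M + 1        # single value: added to every element
B = M + row      # (3, 4) + (4,)   -> row is added to every row of M
C = M + col      # (3, 4) + (3, 1) -> col is added to every column of M
D = col + row    # (3, 1) + (4,)   -> (3, 4), with D[i, j] = i + j

X = np.array([[1, 2],
              [3, 4]])
E = X * X            # elementwise product: [[1, 4], [9, 16]]
P = np.dot(X, X)     # matrix product:      [[7, 10], [15, 22]]
mask = X > 2         # elementwise comparison: [[False, False], [True, True]]

print(D)
print(E)
print(P)
print(mask)
```

Shapes that cannot be matched this way, such as (3, 4) and (3,), raise a ValueError instead.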

Exercises

  1. Select a tile-pattern subset of a 5x9 matrix like this:
  2. ..and like this:
  3. ..and also like this:
  4. Adapt the code for No.3 so that it works with arrays of arbitrary dimensions.
  5. Write the code that performs the operation depicted below (source). Parameterize your code and use the above utility function to plot the final matrix in dimensions 8x2 and 256x64.
  6. Write a function that subtracts the mean from a given matrix (arbitrary dimensions).
  7. Write a function that gradually weighs the rows of a given matrix from top to bottom (arbitrary dimensions).
  8. Write one line that checks whether there are any values smaller than 0 within a given array.
  9. Create a two dimensional array containing the values 0..9.
    1. Reverse the order of the rows of the matrix using a single slice.
    2. Reverse the order of the columns of the matrix using a single slice.
    3. Reverse the order of both the rows and the columns of the matrix using a single slice.
  10. Check the documentation: What is the difference between np.max() and np.nanmax()?
    1. Think of two cases where it would be important to use one over the other!
    2. Explain how you can find both functions using only the numpy documentation itself.

In [ ]:
#-#-# EXC_NUMPY: YOUR CODE HERE #-#-#